fix modules coredump due to data race in shm_manager #87
Open
csstormq wants to merge 1 commit into ApolloAuto:master from csstormq:dev
bjtulynn approved these changes on Aug 2, 2019
The localization module process crashed because of an exception. The coredump stack trace it produced is:
Root cause analysis
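Calling get_publisher_links() invokes the std::vector copy constructor, which copies the elements of publisher_links_ into a temporary object. One possible execution sequence for that copy is:
1) The std::vector base-class constructor, _Base(__x.size()), runs first. If publisher_links_ holds 0 elements at that moment, the base class initializes _M_start, _M_finish and _M_end_of_storage to 0.
2) After the base-class constructor returns, the code at stl_vector.h:313 copies the elements of publisher_links_. If another thread adds one element to publisher_links_ before that call, publisher_links_._M_start and publisher_links_._M_finish are updated, and the difference between them is 0x10 (sizeof shows that one element of publisher_links_ occupies 16 bytes). std::__uninitialized_copy_a() then constructs one element of type boost::shared_ptr and returns __cur (value 0x10), which is assigned to the temporary object's _M_finish data member.
3) When the temporary object is destroyed, it releases the memory from _M_start (value 0x0) to _M_finish (value 0x10), which crashes the localization module process.
That this ordering can really occur was also confirmed by stepping a test program through gdb, which produces a coredump almost identical to the one from the localization module. The debugging procedure is described in the gdb test-program section below.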
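The pointer arithmetic in that sequence can be replayed in a small single-threaded model. This is only a sketch under stated assumptions: VecImplModel, base_construct, uninitialized_copy_model, racy_copy_model and the addresses 0x1000 and 0x2000 are hypothetical names and values invented for illustration, and addresses are kept as plain integers so that nothing invalid is ever dereferenced. It is not the libstdc++ code and not the test program from this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model of the three storage pointers of a std::vector,
// stored as integers so the example never touches invalid memory.
struct VecImplModel {
  std::uintptr_t start = 0;
  std::uintptr_t finish = 0;
  std::uintptr_t end_of_storage = 0;
};

// Element size reported in the description (16 bytes per element).
constexpr std::uintptr_t kElemSize = 16;

// Step 1: models _Base(__x.size()). With n == 0 nothing is allocated,
// so all three pointers stay 0. fake_addr is an illustrative address.
VecImplModel base_construct(std::size_t n, std::uintptr_t fake_addr) {
  VecImplModel v;
  if (n != 0) {
    v.start = fake_addr;
  }
  v.finish = v.start;
  v.end_of_storage = v.start + n * kElemSize;
  return v;
}

// Step 2: models the uninitialized copy of [src.start, src.finish)
// to dst_start and returns the position after the last copy (__cur).
std::uintptr_t uninitialized_copy_model(const VecImplModel& src,
                                        std::uintptr_t dst_start) {
  std::uintptr_t cur = dst_start;
  for (std::uintptr_t p = src.start; p != src.finish; p += kElemSize) {
    cur += kElemSize;  // one element "constructed" at cur
  }
  return cur;
}

// Replays the interleaving from the description in a fixed order.
VecImplModel racy_copy_model() {
  VecImplModel source;  // stands in for publisher_links_, still empty

  // Reader: observes size 0 and runs the base-class constructor.
  std::size_t observed = (source.finish - source.start) / kElemSize;
  assert(observed == 0);
  VecImplModel temp = base_construct(observed, 0x2000);

  // Writer: adds one element between step 1 and step 2.
  source.start = 0x1000;
  source.finish = source.start + kElemSize;
  source.end_of_storage = source.finish;

  // Reader: copies the range it sees now into storage sized for 0.
  temp.finish = uninitialized_copy_model(source, temp.start);
  return temp;
}
```

In this model the temporary ends up with start 0x0 and finish 0x10, the same range that step 3 says is released on destruction. Making every read and write of publisher_links_ hold the same mutex would rule out this interleaving; whether that is the change made in this PR is not shown here.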
Debugging the test program with gdb
main.cpp:
shm_manager_sim.h:
shm_manager_sim.cpp:
topic_manager_sim.h:
topic_manager_sim.cpp:
subscription_sim.h:
subscription_sim.cpp:
README.md: